[POC][RFC] Metals BSP third-party code navigation #21143
digests = []
for cpe in classpath_entries:
    digests.append(cpe.digest)
for alse in applicable_lockfile_source_entries:
    new_file = FileEntry(alse.file_name, alse.file_digest)
    digest = await Get(Digest, CreateDigest([new_file]))
    digests.append(digest)

    for dep in alse.dependencies:
        coord = Coordinate.from_coord_str(dep.to_coord_str())
        dep_artifact_requirement = ArtifactRequirement(coord)
        dep_entry = get_entry_for_coord(lockfile, dep_artifact_requirement.coordinate)
        dep_new_file = FileEntry(dep_entry.file_name, dep_entry.file_digest)
        dep_digest = await Get(Digest, CreateDigest([dep_new_file]))
        digests.append(dep_digest)
        src_ent = applicable_lockfile_source_entries.get(alse)
        applicable_lockfile_source_entries_inverse.get(src_ent).append(dep_entry)

resolve_digest = await Get(Digest, MergeDigests(digests))
inverse = dict(zip(classpath_entries, applicable_lockfile_entries))

s = map(
    lambda x: (x, applicable_lockfile_source_entries_inverse.get(inverse.get(x))),
    classpath_entries,
)
This whole thing is convoluted just to show that it works; I'm mashing it in here to avoid changing too much elsewhere.
This can also just be a list comprehension.
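For illustration, here is the map(lambda ...) above written as a list comprehension, using hypothetical stand-in dicts in place of the real classpath/lockfile objects:

```python
# Stand-ins for the real objects, just to show the shape of the data.
classpath_entries = ["cpe_a", "cpe_b"]
inverse = {"cpe_a": "lock_a", "cpe_b": "lock_b"}
applicable_lockfile_source_entries_inverse = {"lock_a": ["dep_1"], "lock_b": []}

# The map(lambda x: ...) from the diff, as a list comprehension:
s = [
    (cpe, applicable_lockfile_source_entries_inverse.get(inverse.get(cpe)))
    for cpe in classpath_entries
]
# s == [("cpe_a", ["dep_1"]), ("cpe_b", [])]
```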
dependency_modules = await Get(
    DependencyModulesResult, DependencyModulesParams, DependencyModulesParams(request.targets)
)
This would have its own backend-specific rule, I suppose; for now it just pulls from dependency_modules, fully expecting it to be a Maven module, again to try it out without implementing a whole rule indirection for it.
Understood. Maybe add a TODO comment as a reminder to create the union hook which makes this backend-agnostic?
This would have its own backend-specific rule I suppose
Yes, because the core BSP logic should not be aware of the per-language specifics.
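As a rough illustration of the union-hook idea being suggested: the core logic dispatches on a request type without knowing the backend. Pants does this with @union / UnionRule; the sketch below mimics the idea in plain Python with a hypothetical handler registry, so all names here are my own.

```python
# Plain-Python illustration of the suggested union hook: core BSP code
# dispatches on a request type without knowing per-language details.
# (Pants does this with @union / UnionRule; all names here are hypothetical.)

class DependencyModulesBackendRequest:
    """Backend-agnostic request type that the core logic knows about."""
    def __init__(self, targets):
        self.targets = targets

_handlers = {}

def register_handler(request_type):
    # Stand-in for registering a UnionRule for the request type.
    def wrap(fn):
        _handlers[request_type] = fn
        return fn
    return wrap

@register_handler(DependencyModulesBackendRequest)
def jvm_dependency_modules(request):
    # Maven-specific logic lives in the JVM backend, not in core BSP code.
    return [f"maven-module:{t}" for t in request.targets]

def dependency_modules(request):
    # Core logic: look up whichever backend registered for this request type.
    return _handlers[type(request)](request)
```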
-    entries = {(i.coord.group, i.coord.artifact): i for i in self.entries}
-    entry = entries.get((coord.group, coord.artifact))
+    entries = {(i.coord.group, i.coord.artifact, i.coord.classifier): i for i in self.entries}
+    entry = entries.get((coord.group, coord.artifact, coord.classifier))
Had to make sure everything includes the classifier, since (group, artifact) alone might not be unique anymore.
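A quick self-contained sketch (with made-up entries) of why (group, artifact) keys stop being unique once source jars are in the lockfile:

```python
# Once the sources jar is in the lockfile, two entries share group/artifact
# and differ only by classifier, so (group, artifact) keys collide.
entries = [
    ("com.example", "lib", None),       # the regular jar
    ("com.example", "lib", "sources"),  # its source jar
]
by_group_artifact = {(g, a): c for (g, a, c) in entries}
by_group_artifact_classifier = {(g, a, c): c for (g, a, c) in entries}
# len(by_group_artifact) == 1: the sources entry silently overwrote the jar.
# len(by_group_artifact_classifier) == 2: both entries survive.
```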
if ":jar:sources:" in dep['coord']:
    path = os.path.join(os.getcwd(), sys.argv[1] + ".empty")
    f = open(path, "w")
    f.write("")
    f.close()
    source = PurePath(path)
    dest_name = dep['coord'].replace(":", "_")
    classpath_dest = f"classpath/{dest_name}{ext}"
    existing_source = classpath.get(classpath_dest)
    if existing_source:
        if existing_source == source:
            # We've already captured this file.
            continue
        raise Exception(
            f"Duplicate jar name {classpath_dest} with incompatible source:\\n"
            f"  {source}\\n"
            f"  {existing_source}\\n"
        )
    classpath[classpath_dest] = source
    copyfile(source, classpath_dest)
    continue
else:
This is literally just a copy of the normal action, but passing an empty file for missing source jars (super hacky, yes). I noticed things upstream expect results to match 1-to-1 and in order, so I did this hack instead of finding every place where that was assumed. Since source jars aren't always available, they'd have to be allowed to be missing.
For my education, do you have an example of how this setup script fails without this change?
--default=true \
--sources \
Here's where I'm thinking it should be configurable to have coursier fetch source jars / javadoc jars etc. Currently, in this PoC, I just add source jars on top of the normal ones.
So javadoc could be an additional type of jar added to the resolve?
reqs = list(requirements)
req_sources = [
    dataclasses.replace(
        i, coordinate=dataclasses.replace(i.coordinate, classifier="sources")
    )
    for i in reqs
]

extended = []
for i in reqs:
    extended.append(i)
for i in req_sources:
    extended.append(i)

return cls(FrozenOrderedSet(sorted(i.to_metadata_str() for i in extended)))
Again, if enabling source jars, they would be added to the lockfile and lockfile metadata as well.
    digests.append(cpe.digest)
for alse in applicable_lockfile_source_entries:
    new_file = FileEntry(alse.file_name, alse.file_digest)
    digest = await Get(Digest, CreateDigest([new_file]))
I'm aware all these individual CreateDigest calls etc. would not be done one-by-one in a for loop; I just left everything as simple as I could to present the overall approach.
Then I suggest a TODO comment here as a reminder to just convert this to a single CreateDigest with a list of FileEntry.
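A rough sketch of that batching, with minimal stand-ins for the Pants FileEntry/CreateDigest types (in the real rule the result would still be awaited via a single Get):

```python
from dataclasses import dataclass

# Stand-ins for the Pants types, just to show the batched shape.
@dataclass(frozen=True)
class FileEntry:
    file_name: str
    file_digest: str

@dataclass(frozen=True)
class CreateDigest:
    entries: tuple

source_entries = [("lib-sources.jar", "digest-1"), ("dep-sources.jar", "digest-2")]
# One CreateDigest over the whole list, instead of one request per entry in a loop:
request = CreateDigest(tuple(FileEntry(n, d) for n, d in source_entries))
```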
@@ -114,7 +115,7 @@ class ThirdpartyModulesRequest:
 @dataclass(frozen=True)
 class ThirdpartyModules:
     resolve: CoursierResolveKey
-    entries: dict[CoursierLockfileEntry, ClasspathEntry]
+    entries: dict[CoursierLockfileEntry, tuple[ClasspathEntry, list[CoursierLockfileEntry]]]
The inner list should be a tuple instead, since "frozen" dataclasses need to be hashable and list is not hashable (since it is mutable).
Also, maybe it would be better to just use a new dataclass instead of a 2-tuple? Then you could give a name to each component which will help with readability.
And is there any particular ordering of these CoursierLockfileEntry instances?
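Sketching both suggestions together (all names hypothetical): a small frozen dataclass in place of the 2-tuple, with a tuple field so the whole value stays hashable:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ThirdpartyModuleEntry:
    classpath_entry: str             # stand-in for ClasspathEntry
    source_entries: tuple[str, ...]  # stand-in for tuple[CoursierLockfileEntry, ...]

entry = ThirdpartyModuleEntry("lib.jar", ("lib-sources.jar",))
hash(entry)  # works; with a list field this would raise TypeError
```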
artifact_requirement_source = dataclasses.replace(
    artifact_requirement,
    coordinate=dataclasses.replace(
        artifact_requirement.coordinate, classifier="sources"
I recall from the PR description that you had some concern about hard-coding the "sources" classifier. Seems fine to me. I suggest defining a set of constants for well-known classifier names though, and using a constant here instead of the open-coded "sources" string.
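A minimal sketch of such constants (the names here are my own, not from the PR), producing the groupId:artifactId:packaging:classifier:version form that the ":jar:sources:" check in the setup script matches against:

```python
# Hypothetical constants for well-known Maven classifiers:
SOURCES_CLASSIFIER = "sources"
JAVADOC_CLASSIFIER = "javadoc"

def coord_str(group: str, artifact: str, version: str, classifier: str) -> str:
    # Maven coordinate with an explicit classifier,
    # e.g. com.example:lib:jar:sources:1.0
    return f"{group}:{artifact}:jar:{classifier}:{version}"
```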
coord = Coordinate.from_coord_str(dep.to_coord_str())
dep_artifact_requirement = ArtifactRequirement(coord)
dep_entry = get_entry_for_coord(lockfile, dep_artifact_requirement.coordinate)
dep_new_file = FileEntry(dep_entry.file_name, dep_entry.file_digest)
Conversion to FileEntry seems to occur more than once. Maybe define a to_file_entry method on the dep_entry?
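A sketch of that helper, with minimal stand-ins for the real lockfile-entry and FileEntry types:

```python
from dataclasses import dataclass

# Minimal stand-ins for the real types, to show the suggested helper method.
@dataclass(frozen=True)
class FileEntry:
    file_name: str
    file_digest: str

@dataclass(frozen=True)
class LockfileEntry:
    file_name: str
    file_digest: str

    def to_file_entry(self) -> FileEntry:
        # Centralizes the FileEntry(entry.file_name, entry.file_digest) pattern.
        return FileEntry(self.file_name, self.file_digest)
```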
)
if ":jar:sources:" in dep['coord']:
    path = os.path.join(os.getcwd(), sys.argv[1] + ".empty")
    f = open(path, "w")
More idiomatic is to use with open(path, "w") as f: syntax instead.
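That suggestion applied to the snippet above; a temp dir is used here so the sketch is self-contained (the real script builds the path from os.getcwd() and sys.argv[1]):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.empty")
with open(path, "w") as f:
    f.write("")  # the context manager closes the file even if the write raises
```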
Note: this is not an intended implementation, simply a PoC of one path to achieving it.
A PoC for extended Metals (VS Code) support for third-party code navigation.
The underlying issue is that Metals requires source jars. IntelliJ works around this by decompiling anything that doesn't have one; Metals doesn't.
The implementation would probably be better centered around classifiers in general instead of specifically "sources", but I simplified it as much as possible here to get it to a working state.